PROSE: Perceptual Risk Optimization for Speech Enhancement
Abstract
The goal in speech enhancement is to obtain an estimate of clean speech starting from the noisy signal by minimizing a chosen distortion measure, which results in an estimate that depends on the unknown clean signal or its statistics. Since access to such prior knowledge is limited or not possible in practice, one has to estimate the clean signal statistics. In this paper, we develop a new risk minimization framework for speech enhancement, in which one optimizes an unbiased estimate of the distortion/risk instead of the actual risk. The estimated risk is expressed solely as a function of the noisy observations. We consider several perceptually relevant distortion measures and develop corresponding unbiased estimates under realistic assumptions on the noise distribution and a priori signal-to-noise ratio (SNR). Minimizing the risk estimates gives rise to the corresponding denoisers, which are nonlinear functions of the a posteriori SNR. Perceptual evaluation of speech quality (PESQ), average segmental SNR (SSNR) computations, and listening tests show that the proposed risk optimization approach employing Itakura-Saito and weighted hyperbolic cosine distortions gives better performance than the other distortion measures. For SNRs greater than 5 dB, the proposed approach gives superior denoising performance over the benchmark techniques based on the Wiener filter, log-MMSE minimization, and Bayesian nonnegative matrix factorization.

Index Terms: Speech enhancement, perceptual distortion measure, unbiased risk estimation, Stein's lemma, objective and subjective assessment.
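The idea of optimizing an unbiased risk estimate that depends only on the noisy observations can be illustrated with a minimal sketch. The example below is not the paper's method: it assumes the plain squared-error distortion (not the perceptual measures considered in the paper), additive white Gaussian noise of known variance, and a single scalar gain applied to synthetic data. The function names `sure_linear_gain` and `sure_optimal_gain` are hypothetical. Under those assumptions, Stein's lemma gives a risk estimate whose minimizer is a nonlinear function of the a posteriori SNR.

```python
import numpy as np


def sure_linear_gain(y, sigma2, a):
    """Stein's unbiased estimate of the per-sample MSE of the estimator a * y.

    Assumes y = x + n with n ~ N(0, sigma2). The clean signal x is not used.
    """
    return (a - 1.0) ** 2 * np.mean(y ** 2) + 2.0 * sigma2 * a - sigma2


def sure_optimal_gain(y, sigma2):
    """Gain minimizing the risk estimate above, clipped to be nonnegative.

    The result depends on y only through the a posteriori SNR, mean(y^2) / sigma2.
    """
    post_snr = np.mean(y ** 2) / sigma2
    return max(0.0, 1.0 - 1.0 / post_snr)


# Synthetic data (an assumption for illustration): Laplacian "clean" coefficients.
rng = np.random.default_rng(0)
n_samples = 200_000
sigma2 = 0.5
x = rng.laplace(scale=1.0, size=n_samples)
y = x + rng.normal(scale=np.sqrt(sigma2), size=n_samples)

a_hat = sure_optimal_gain(y, sigma2)
est_risk = sure_linear_gain(y, sigma2, a_hat)   # uses noisy data only
true_risk = np.mean((a_hat * y - x) ** 2)       # oracle MSE, needs clean x

print(f"gain={a_hat:.3f}  estimated risk={est_risk:.3f}  true risk={true_risk:.3f}")
```

With enough samples, the estimated risk tracks the oracle risk closely even though the clean signal is never consulted, which is the property the framework relies on.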
Similar resources
A New Shuffled Sub-swarm Particle Swarm Optimization Algorithm for Speech Enhancement
In this paper, we propose a novel algorithm to enhance the noisy speech in the framework of dual-channel speech enhancement. The new method is a hybrid optimization algorithm, which employs the combination of the conventional θ-PSO and the shuffled sub-swarms particle optimization (SSPSO) technique. It is known that the θ-PSO algorithm has better optimization performance than standard PSO al...
Perceptual Factor Analysis for Speech Enhancement
This paper presents a new speech enhancement approach originated from factor analysis (FA) framework. FA is a data analysis model where the relevant common factors can be extracted from observations. A factor loading matrix is found and a resulting model error is introduced for each observation. Interestingly, FA is a subspace approach properly representing the noisy speech. This approach parti...
Speech Enhancement Through an Optimized Subspace Division Technique
The speech enhancement techniques are often employed to improve the quality and intelligibility of the noisy speech signals. This paper discusses a novel technique for speech enhancement which is based on Singular Value Decomposition. This implementation utilizes a Genetic Algorithm based optimization method for reducing the effects of environmental noises from the singular vectors as well as t...
A Heuristic Speech De-noising with the aid of Dual Tree Complex Wavelet Transform using Teaching-Learning Based Optimization
Abstract— In our present work, we propose a nature inspired population based speech enhancement technique to find the dynamic threshold value using Teaching-Learning Based Optimization (TLBO) algorithm by using shift invariant property of dual tree complex wavelet transform (DT-CWT). The performance of these proposed methods are evaluated in terms of Perceptual Evaluation of Speech Quality (PES...
Journal: CoRR
Volume: abs/1710.03975
Publication date: 2017